Exploring Data Sets

By: Sydney Roberts

 


 

Consider the following data set:

Capture1.JPG

 

This data is from the lumber industry, giving the approximate number of board feet of lumber per tree in a forest of a given age.

In order to fill in the missing information in the data set, we can use Excel to fit a function to the data. To get an idea of what shape this function needs, it’s a good idea to start by creating and looking at a scatterplot containing the data. This can also be done using Excel.

 

Capture2.JPG

We can tell that as the age of the tree increases, the number of board feet of lumber also increases. However, it doesn’t appear to increase at a linear rate. Instead, it appears to increase exponentially. Because of this observation, we can use Excel to fit an exponential function of the form $y = ae^{bx}$.

For this data set, Excel found the best exponential fit to be  with an $R^2$ value of 0.8968.
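To show how an exponential fit like this can be computed by hand, here is a minimal sketch in Python. It assumes the common approach of taking the natural log of the y-values and fitting a straight line to (x, ln y) by least squares, which, as far as I understand, is also how Excel's exponential trendline works. The function name `fit_exponential` and the sample numbers are my own placeholders, not the lumber data, which is only available in the figure.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by ordinary least squares on (x, ln y).

    Assumes every y is positive, since ln(y) is undefined otherwise.
    """
    n = len(xs)
    log_ys = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_log_y = sum(log_ys) / n
    # Slope and intercept of the least-squares line through (x, ln y)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ly - mean_log_y) for x, ly in zip(xs, log_ys))
    b = sxy / sxx
    a = math.exp(mean_log_y - b * mean_x)
    return a, b

# Hypothetical placeholder data generated from y = 2 * exp(0.01 * x).
# These are NOT the lumber measurements from the data set above.
sample_x = [10, 20, 30, 40, 50]
sample_y = [2.0 * math.exp(0.01 * x) for x in sample_x]

a, b = fit_exponential(sample_x, sample_y)
print(f"y = {a:.4f} * exp({b:.4f} * x)")
```

Because the line is fitted to ln y rather than to y itself, the result minimizes error on the log scale, so it can differ slightly from a fit that minimizes error in the original units.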

Capture3.JPG

 

This $R^2$ value gives us the proportion of the variation in the data that can be explained by the model function. Therefore, this $R^2$ value tells us that only about 89.68% of the variation in the given data can be explained by this exponential function. This isn’t a very good fit: the model appears to increase faster than the number of board feet actually increases with age. However, we know we still want a function that curves and keeps increasing as the age goes towards infinity.
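For reference, the usual definition of this value can be written as follows. The symbols are my own notation and do not come from the Excel output: $y_i$ are the observed values, $\hat{y}_i$ are the values predicted by the model, and $\bar{y}$ is the mean of the observed values.

```latex
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
```

A value of 1 means the model reproduces every data point exactly, and values closer to 0 mean the model does little better than always predicting the mean.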

Therefore, my next step is to try fitting a power function of the form $y = ax^b$ to the data set. Again we can use Excel for this, and we get that the best power function to fit the data is  with an $R^2$ value of 0.9999. This shows that it is in fact a very good model, accounting for 99.99% of the variation in the data.
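A power fit can be sketched in the same spirit. The version below assumes the standard log-log approach: take the natural log of both x and y, fit a straight line, and report the R-squared of that line on the log scale, which I believe matches what Excel reports for a power trendline. The function name `fit_power` and the sample numbers are hypothetical and are not the lumber data.

```python
import math

def fit_power(xs, ys):
    """Fit y = a * x**b by least squares on (ln x, ln y).

    Returns (a, b, r_squared), where r_squared is measured on the
    log-transformed data. Assumes all x and y values are positive.
    """
    n = len(xs)
    log_xs = [math.log(x) for x in xs]
    log_ys = [math.log(y) for y in ys]
    mean_lx = sum(log_xs) / n
    mean_ly = sum(log_ys) / n
    sxx = sum((u - mean_lx) ** 2 for u in log_xs)
    sxy = sum((u - mean_lx) * (v - mean_ly) for u, v in zip(log_xs, log_ys))
    b = sxy / sxx
    log_a = mean_ly - b * mean_lx
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((v - (log_a + b * u)) ** 2 for u, v in zip(log_xs, log_ys))
    ss_tot = sum((v - mean_ly) ** 2 for v in log_ys)
    r_squared = 1.0 - ss_res / ss_tot
    return math.exp(log_a), b, r_squared

# Hypothetical placeholder data generated from y = 0.5 * x**2.
# These are NOT the lumber measurements from the data set above.
sample_x = [10, 20, 30, 40, 50]
sample_y = [0.5 * x ** 2 for x in sample_x]

a, b, r2 = fit_power(sample_x, sample_y)
print(f"y = {a:.4f} * x^{b:.4f}, R^2 = {r2:.4f}")
```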

 

Capture4.JPG

 

It would be hard to find a more accurate model than this, so we will consider this the function that best fits the given data. Now that we have this function, we can use it to predict our missing values by substituting the age of the tree into our function for x and then calculating the y-value. This y-value will be our predicted number of hundreds of board feet for trees of that age.

For x = 60 →

For x = 140 →

For x = 180 →
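The substitution step can also be sketched in Python. The coefficients below are hypothetical placeholders chosen only to make the example run; to reproduce the predictions for this data set, replace them with the values from the fitted power function. The function name `predict_power` is my own.

```python
def predict_power(a, b, x):
    """Evaluate the power model y = a * x**b at a given x."""
    return a * x ** b

# Hypothetical coefficients, NOT the fitted values for the lumber data.
a, b = 0.5, 2.0

# Ages whose values are missing from the data set
for age in (60, 140, 180):
    print(f"x = {age}: predicted y = {predict_power(a, b, age):.2f}")
```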


For an Excel Document with the data set and function already entered, click HERE

